Segmented Regression Estimators for Massive Data Sets
Abstract
We describe two methodologies for obtaining segmented regression estimators from massive training data sets. The first, called Linear Regression Tree (LRT), is used for continuous response variables; the second, complementary methodology, called Naive Bayes Tree (NBT), is used for categorical response variables. Both are implemented in the IBM ProbE (Probabilistic Estimation) data mining engine, an object-oriented framework for building classes of segmented predictive models from massive training data sets. Based on this methodology, an application called ATM-SE™ for direct-mail targeted marketing has been developed jointly with Fingerhut Business Intelligence [1].
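As a rough illustration of what a segmented regression estimator is, the sketch below splits a one-dimensional training set at a single threshold and fits a separate least-squares line in each segment, choosing the threshold that minimizes the total squared error. This is only a toy under stated assumptions: it is not the LRT algorithm or any part of ProbE, and the names `fit_line`, `best_split`, and `predict` are hypothetical.

```python
from typing import List, Tuple

Line = Tuple[float, float]  # (intercept, slope)


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = a + b*x; returns (a, b, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0  # constant x: fall back to a flat line
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def best_split(xs: List[float], ys: List[float],
               min_size: int = 2) -> Tuple[float, Line, Line]:
    """Try every threshold between consecutive sorted x values and keep the
    one whose two per-segment fits give the smallest total squared error."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for i in range(min_size, len(sx) - min_size + 1):
        if sx[i] == sx[i - 1]:
            continue  # no threshold can separate equal x values
        a_l, b_l, sse_l = fit_line(sx[:i], sy[:i])
        a_r, b_r, sse_r = fit_line(sx[i:], sy[i:])
        total = sse_l + sse_r
        if best is None or total < best[0]:
            threshold = (sx[i - 1] + sx[i]) / 2
            best = (total, threshold, (a_l, b_l), (a_r, b_r))
    if best is None:
        raise ValueError("not enough distinct points to split")
    return best[1], best[2], best[3]


def predict(x: float, threshold: float, left: Line, right: Line) -> float:
    """Route x to its segment and apply that segment's linear model."""
    a, b = left if x <= threshold else right
    return a + b * x
```

A tree-structured estimator would apply such splits recursively over many features. The exhaustive scan here refits both lines at every candidate threshold, which is fine for a toy but would need to be replaced by something far cheaper on massive data sets.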
Similar resources
Asymptotics and confidence estimation in segmented regression models
Rebekah Ann Robinson, May 11, 2012. Standard regularity assumptions for regression models are not satisfied in segmented regression models with an unknown change point, and consequently standard asymptotic results and inferential methods for confidence estimation are not applicable. This dissertation considers a clustered segmen...
Fuzzy Robust Regression Analysis with Fuzzy Response Variable and Fuzzy Parameters Based on the Ranking of Fuzzy Sets
Robust regression is an appropriate alternative to ordinal regression when outliers exist in a given data set. Ordinal regression methods cannot model fuzzy observations; in this case, fuzzy regression is a good method. When observations are fuzzy and there are outliers in the data sets, robust fuzzy regression methods are appropriate alternatives...
A comparison of estimators for regression models with change points
We consider two problems concerning locating change points in a linear regression model. One involves jump discontinuities (change-point) in a regression model and the other involves regression lines connected at unknown points. We compare four methods for estimating single or multiple change points in a regression model, when both the error variance and regression coefficients change simultane...
Positive-Shrinkage and Pretest Estimation in Multiple Regression: A Monte Carlo Study with Applications
Consider the problem of predicting a response variable using a set of covariates in a linear regression model. If it is known or suspected a priori that a subset of the covariates does not contribute significantly to the overall fit of the model, a restricted model that excludes these covariates may be sufficient. If, on the other hand, the subset provides useful information, shrinkage meth...
Penalized Estimators in Cox Regression Model
Proportional hazards Cox regression models play a key role in analyzing censored survival data. We use penalized methods in high-dimensional scenarios to obtain more efficient models. This article reviews penalized Cox regression for some frequently used penalty functions. Analysis of the medical data set "mgus2" confirms that penalized Cox regression performs better than the Cox regressi...